Search results for "high dimensional data"
Showing 4 of 4 documents
From optimization to algorithmic differentiation: a graph detour
2021
This manuscript highlights the work of the author since his appointment as "Chargé de Recherche" (research scientist) at the Centre national de la recherche scientifique (CNRS) in 2015. In particular, it traces a thematic and chronological evolution of his research interests:
- The first part, following his post-doctoral work, concerns the development of new algorithms for non-smooth optimization.
- The second part, the heart of his research in 2020, focuses on the analysis of machine learning methods for graph (signal) processing.
- Finally, the third and last part, oriented towards the future, concerns the (automatic or not) differentiation of algorithms for learnin…
Sparse relative risk survival modelling
2016
Cancer survival is thought to be closely linked to the genomic constitution of the tumour. Discovering such signatures will be useful in the diagnosis of the patient and may be used for treatment decisions, and perhaps even for the development of new treatments. However, genomic data are typically noisy and high-dimensional, with the number of features often outstripping the number of patients included in the study. Regularized survival models have been proposed to deal with such scenarios. These methods typically induce sparsity by means of a coincidental match of the geometry of the convex likelihood and a (near) non-convex regularizer.
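To make the regularization idea concrete, here is a minimal sketch, assuming a standard Lasso-penalized Cox relative risk model (a common baseline, not necessarily this paper's method), fitted by proximal gradient descent with L1 soft-thresholding; the step size, penalty level `lam`, and synthetic data are illustrative assumptions.

```python
import numpy as np

def cox_grad(beta, X, time, event):
    """Gradient of the negative Cox partial log-likelihood (Breslow, no ties)."""
    w = np.exp(X @ beta)
    grad = np.zeros_like(beta)
    for i in np.where(event)[0]:
        risk = time >= time[i]                     # risk set at the i-th event time
        grad -= X[i] - (w[risk] @ X[risk]) / w[risk].sum()
    return grad

def soft_threshold(z, t):
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def lasso_cox(X, time, event, lam=0.5, step=0.01, iters=2000):
    """Proximal gradient: smooth likelihood step, then L1 soft-thresholding."""
    beta = np.zeros(X.shape[1])
    for _ in range(iters):
        beta = soft_threshold(beta - step * cox_grad(beta, X, time, event), step * lam)
    return beta

# Illustrative data: only the first two of twenty features carry signal.
rng = np.random.default_rng(0)
n, p = 80, 20
X = rng.standard_normal((n, p))
true_beta = np.zeros(p); true_beta[:2] = [1.0, -1.0]
time = rng.exponential(1.0 / np.exp(X @ true_beta))
event = rng.random(n) < 0.7                        # roughly 30% right-censoring
print(np.round(lasso_cox(X, time, event), 2))      # most coefficients shrunk to exactly 0
```

The exact zeros arise at the kink of the L1 ball, the kind of geometric interplay between convex likelihood and regularizer that the abstract alludes to.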
A fast and recursive algorithm for clustering large datasets with k-medians
2012
Clustering large samples of high-dimensional data with fast algorithms is an important challenge in computational statistics. Borrowing ideas from MacQueen (1967), who introduced a sequential version of the $k$-means algorithm, a new class of recursive stochastic gradient algorithms designed for the $k$-medians loss criterion is proposed. By their recursive nature, these algorithms are very fast and well adapted to deal with large samples of data that are allowed to arrive sequentially. It is proved that the stochastic gradient algorithm converges almost surely to the set of stationary points of the underlying loss criterion. Particular attention is paid to the averaged versions, which…
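A minimal sketch of the recursive idea described above, assuming initialization near the first point, a decreasing step size of the form c0 / n^alpha, and Polyak-Ruppert averaging for the "averaged versions"; these choices are illustrative, not the paper's exact algorithm.

```python
import numpy as np

def online_kmedians(stream, k, c0=1.0, alpha=0.66, seed=0):
    """Recursive stochastic gradient for the k-medians loss.

    Each arriving point moves its nearest centre by gamma_n along the unit
    vector towards the point, i.e. minus the gradient of ||x - c||.
    """
    rng = np.random.default_rng(seed)
    centers = avg = counts = None
    for n, x in enumerate(stream, start=1):
        x = np.asarray(x, dtype=float)
        if centers is None:                        # initialize all centres near the first point
            centers = x + 0.01 * rng.standard_normal((k, x.size))
            avg, counts = centers.copy(), np.zeros(k)
        d = np.linalg.norm(centers - x, axis=1)
        j = int(np.argmin(d))                      # nearest centre wins the point
        counts[j] += 1
        if d[j] > 0:
            gamma = c0 / counts[j] ** alpha        # slowly decreasing step size
            centers[j] += gamma * (x - centers[j]) / d[j]
        avg += (centers - avg) / n                 # running average of the iterates
    return centers, avg

# Two well-separated clusters arriving sequentially, in random order.
rng = np.random.default_rng(1)
data = np.vstack([rng.normal(0, 1, (500, 2)), rng.normal(6, 1, (500, 2))])
rng.shuffle(data)
centers, averaged = online_kmedians(iter(data), k=2)
print(np.round(averaged, 2))
```

One pass over the stream costs O(k) distance evaluations per point, which is what makes the recursive scheme attractive for large sequential samples.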
LogDet divergence-based metric learning with triplet constraints and its applications
2014
How to select and weight features has always been a difficult problem in many image processing and pattern recognition applications. A data-dependent distance measure can address this problem to a certain extent, so accurate and efficient metric learning becomes necessary. In this paper, we propose a LogDet divergence-based metric learning with triplet constraints (LDMLT) approach, which can learn a Mahalanobis distance metric accurately and efficiently. First, we demonstrate the good properties of triplet constraints and apply them in the LogDet divergence-based metric learning model. Then, to deal with high-dimensional data, we apply a compressed representation method to learn…
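The paper's LDMLT updates rely on LogDet-divergence Bregman projections; as a simpler stand-in, here is a sketch of the same triplet-constrained Mahalanobis idea via hinge-loss subgradient steps with a projection back onto the PSD cone. The margin, step size, and toy triplets are assumptions for illustration.

```python
import numpy as np

def d_mahalanobis(M, x, y):
    diff = x - y
    return diff @ M @ diff                         # squared Mahalanobis distance

def project_psd(M):
    """Clip negative eigenvalues so M remains a valid (PSD) metric matrix."""
    w, V = np.linalg.eigh((M + M.T) / 2)
    return (V * np.clip(w, 0.0, None)) @ V.T

def learn_metric(X, triplets, margin=1.0, step=0.01, epochs=200):
    """Enforce d_M(a, p) + margin <= d_M(a, n) for each triplet (a, p, n)."""
    M = np.eye(X.shape[1])
    for _ in range(epochs):
        for a, p, n in triplets:
            slack = margin + d_mahalanobis(M, X[a], X[p]) - d_mahalanobis(M, X[a], X[n])
            if slack > 0:                          # constraint violated: subgradient step
                dp, dn = X[a] - X[p], X[a] - X[n]
                M = project_psd(M - step * (np.outer(dp, dp) - np.outer(dn, dn)))
    return M

# Toy example: feature 0 carries the class signal, feature 1 is pure noise.
rng = np.random.default_rng(2)
X = np.column_stack([np.r_[rng.normal(0, .3, 20), rng.normal(3, .3, 20)],
                     rng.normal(0, 1, 40)])
triplets = [(i, (i + 1) % 20, 20 + i) for i in range(20)]  # (anchor, same-class, other-class)
M = learn_metric(X, triplets)
print(np.round(M, 2))                              # weight concentrates on feature 0
```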